AITopics | chest x-ray

Collaborating Authors

chest x-ray

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

b139aeda1c2914e3b579aafd3ceeb1bd-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 21:01:10 GMT

alidation result, constraint, dataset, (16 more...)

Neural Information Processing Systems

Industry:

Health & Medicine (0.48)
Banking & Finance (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Add feedback

MetaChest: Generalized few-shot learning of pathologies from chest X-rays

Montalvo-Lezama, Berenice, Fuentes-Pineda, Gibran

arXiv.org Artificial IntelligenceDec-9-2025

The limited availability of annotated data presents a major challenge for applying deep learning methods to medical image analysis. Few-shot learning methods aim to recognize new classes from only a small number of labeled examples. These methods are typically studied under the standard few-shot learning setting, where all classes in a task are new. However, medical applications such as pathology classification from chest X-rays often require learning new classes while simultaneously leveraging knowledge of previously known ones, a scenario more closely aligned with generalized few-shot classification. Despite its practical relevance, few-shot learning has been scarcely studied in this context. In this work, we present MetaChest, a large-scale dataset of 479,215 chest X-rays collected from four public databases. MetaChest includes a meta-set partition specifically designed for standard few-shot classification, as well as an algorithm for generating multi-label episodes. We conduct extensive experiments evaluating both a standard transfer learning approach and an extension of ProtoNet across a wide range of few-shot multi-label classification tasks. Our results demonstrate that increasing the number of classes per episode and the number of training examples per class improves classification performance. Notably, the transfer learning approach consistently outperforms the ProtoNet extension, despite not being tailored for few-shot learning. We also show that higher-resolution images improve accuracy at the cost of additional computation, while efficient model architectures achieve comparable performance to larger models with significantly reduced resource requirements.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.2559

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MedImageInsight for Thoracic Cavity Health Classification from Chest X-rays

Boya, Rama Krishna, Magalanadu, Mohan Kireeti, Palavalli, Azaruddin, Tekuri, Rupa Ganesh, Pattanayak, Amrit, Enuga, Prasanthi, Muthu, Vignesh Esakki, Boya, Vivek Aditya

arXiv.org Artificial IntelligenceNov-24-2025

Chest radiography remains one of the most widely used imaging modalities for thoracic diagnosis, yet increasing imaging volumes and radiologist workload continue to challenge timely interpretation. In this work, we investigate the use of MedImageInsight, a medical imaging foundational model, for automated binary classification of chest X-rays into Normal and Abnormal categories. Two approaches were evaluated: (1) fine-tuning MedImageInsight for end-to-end classification, and (2) employing the model as a feature extractor for a transfer learning pipeline using traditional machine learning classifiers. Experiments were conducted using a combination of the ChestX-ray14 dataset and real-world clinical data sourced from partner hospitals. The fine-tuned classifier achieved the highest performance, with an ROC-AUC of 0.888 and superior calibration compared to the transfer learning models, demonstrating performance comparable to established architectures such as CheXNet. These results highlight the effectiveness of foundational medical imaging models in reducing task-specific training requirements while maintaining diagnostic reliability. The system is designed for integration into web-based and hospital PACS workflows to support triage and reduce radiologist burden. Future work will extend the model to multi-label pathology classification to provide preliminary diagnostic interpretation in clinical environments.

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.17043

Country: Europe > United Kingdom (0.14)

Genre: Research Report > Experimental Study (0.50)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)

Add feedback

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Lee, Hyungyung, Choi, Geon, Lee, Jung-Oh, Yoon, Hangyul, Hong, Hyuk Gi, Choi, Edward

arXiv.org Artificial IntelligenceOct-28-2025

Recent progress in Large Vision-Language Models (LVLMs) has enabled promising applications in medical tasks, such as report generation and visual question answering. However, existing benchmarks focus mainly on the final diagnostic answer, offering limited insight into whether models engage in clinically meaningful reasoning. To address this, we present CheXStruct and CXReasonBench, a structured pipeline and benchmark built on the publicly available MIMIC-CXR-JPG dataset. CheXStruct automatically derives a sequence of intermediate reasoning steps directly from chest X-rays, such as segmenting anatomical regions, deriving anatomical landmarks and diagnostic measurements, computing diagnostic indices, and applying clinical thresholds. CXReasonBench leverages this pipeline to evaluate whether models can perform clinically valid reasoning steps and to what extent they can learn from structured guidance, enabling fine-grained and transparent assessment of diagnostic reasoning. The benchmark comprises 18,988 QA pairs across 12 diagnostic tasks and 1,200 cases, each paired with up to 4 visual inputs, and supports multi-path, multi-stage evaluation including visual grounding via anatomical region selection and diagnostic measurements. Even the strongest of 12 evaluated LVLMs struggle with structured reasoning and generalization, often failing to link abstract knowledge with anatomically grounded visual interpretation. The code is available at https://github.com/ttumyche/CXReasonBench

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.18087

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Evaluating ChatGPT's Performance in Classifying Pneumonia from Chest X-Ray Images

Prahallad, Pragna, Prahallad, Pranathi

arXiv.org Artificial IntelligenceOct-28-2025

In this study, we evaluate the ability of OpenAI's gpt-4o model to classify chest X-ray images as either NORMAL or PNEUMONIA in a zero-shot setting, without any prior fine-tuning. A balanced test set of 400 images (200 from each class) was used to assess performance across four distinct prompt designs, ranging from minimal instructions to detailed, reasoning-based prompts. The results indicate that concise, feature-focused prompts achieved the highest classification accuracy of 74\%, whereas reasoning-oriented prompts resulted in lower performance. These findings highlight that while ChatGPT exhibits emerging potential for medical image interpretation, its diagnostic reliability remains limited. Continued advances in visual reasoning and domain-specific adaptation are required before such models can be safely applied in clinical practice.

large language model, machine learning, pneumonia, (18 more...)

arXiv.org Artificial Intelligence

2510.21839

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

b139aeda1c2914e3b579aafd3ceeb1bd-Supplemental.pdf

Neural Information Processing SystemsAug-15-2025, 20:42:16 GMT

constraint, dataset, validation result, (16 more...)

Neural Information Processing Systems

Industry:

Health & Medicine (0.48)
Banking & Finance (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Add feedback

Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays

Schuit, Gregory, Parra, Denis, Besa, Cecilia

arXiv.org Artificial IntelligenceAug-12-2025

Generative image models have achieved remarkable progress in both natural and medical imaging. In the medical context, these techniques offer a potential solution to data scarcity--especially for low-prevalence anomalies that impair the performance of AI-driven diagnostic and segmentation tools. However, questions remain regarding the fidelity and clinical utility of synthetic images, since poor generation quality can undermine model generalizability and trust. In this study, we evaluate the effectiveness of state-of-the-art generative models--Generative Adversarial Networks (GANs) and Diffusion Models (DMs)--for synthesizing chest X-rays conditioned on four abnormalities: Atelectasis (AT), Lung Opacity (LO), Pleural Effusion (PE), and Enlarged Cardiac Silhouette (ECS). Using a benchmark composed of real images from the MIMIC-CXR dataset and synthetic images from both GANs and DMs, we conducted a reader study with three radiologists of varied experience. Participants were asked to distinguish real from synthetic images and assess the consistency between visual features and the target abnormality. Our results show that while DMs generate more visually realistic images overall, GANs can report better accuracy for specific conditions, such as absence of ECS. We further identify visual cues radiologists use to detect synthetic images, offering insights into the perceptual gaps in current models. These findings underscore the complementary strengths of GANs and DMs and point to the need for further refinement to ensure generative models can reliably augment training datasets for AI diagnostic systems.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.07128

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Add feedback

Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities

Das, Anindya Bijoy, Sakib, Shahnewaz Karim, Ahmed, Shibbir

arXiv.org Artificial IntelligenceAug-12-2025

Large Language Models (LLMs) are increasingly applied to medical imaging tasks, including image interpretation and synthetic image generation. However, these models often produce hallucinations, which are confident but incorrect outputs that can mislead clinical decisions. This study examines hallucinations in two directions: image to text, where LLMs generate reports from X-ray, CT, or MRI scans, and text to image, where models create medical images from clinical prompts. We analyze errors such as factual inconsistencies and anatomical inaccuracies, evaluating outputs using expert informed criteria across imaging modalities. Our findings reveal common patterns of hallucination in both interpretive and generative tasks, with implications for clinical reliability. We also discuss factors contributing to these failures, including model architecture and training data. By systematically studying both image understanding and generation, this work provides insights into improving the safety and trustworthiness of LLM driven medical imaging systems.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.07031

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models

Sharma, Sonali, Alaa, Ahmed M., Daneshjou, Roxana

arXiv.org Artificial IntelligenceJul-14-2025

Generative AI models, including large language models (LLMs) and vision-language models (VLMs), are increasingly used to interpret medical images and answer clinical questions. Their responses often include inaccuracies; therefore, safety measures like medical disclaimers are critical to remind users that AI outputs are not professionally vetted or a substitute for medical advice. This study evaluated the presence of disclaimers in LLM and VLM outputs across model generations from 2022 to 2025. Using 500 mammograms, 500 chest X-rays, 500 dermatology images, and 500 medical questions, outputs were screened for disclaimer phrases. Medical disclaimer presence in LLM and VLM outputs dropped from 26.3% in 2022 to 0.97% in 2025, and from 19.6% in 2023 to 1.05% in 2025, respectively. By 2025, the majority of models displayed no disclaimers. As public models become more capable and authoritative, disclaimers must be implemented as a safeguard adapting to the clinical context of each output.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.0803

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report > New Finding (0.94)

Industry:

Health & Medicine > Therapeutic Area (0.95)
Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)

Add feedback

Filters

Collaborating Authors

chest x-ray

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

b139aeda1c2914e3b579aafd3ceeb1bd-Supplemental.pdf

MetaChest: Generalized few-shot learning of pathologies from chest X-rays

MedImageInsight for Thoracic Cavity Health Classification from Chest X-rays

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Evaluating ChatGPT's Performance in Classifying Pneumonia from Chest X-Ray Images

aec33ab89b5986605cd7c331396e7e5c-Supplemental-Datasets_and_Benchmarks_Track.pdf

b139aeda1c2914e3b579aafd3ceeb1bd-Supplemental.pdf

Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays

Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities

A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models